Words containing ,:

Rank in Wordlist | Frequency | Word |
---|---|---|
8602 | 17 | %, |
14510 | 7 | 1,5 |
14524 | 7 | 2008,par |
14531 | 7 | 35,78 |
14534 | 7 | 4,5 |
15759 | 6 | %), |
15761 | 6 | 1,7 |
15808 | 6 | 53,86 |
17354 | 5 | 2,5 |
17355 | 5 | 2,5% |

Words containing (:

Rank in Wordlist | Frequency | Word |
---|---|---|
19377 | 4 | 2013(PP |
21419 | 4 | l’Homme(CNIDH |
23032 | 3 | Indépendantes(CECI |
23531 | 3 | Recettes(OBR |
23566 | 3 | Réconciliation(CVR |
26850 | 2 | 2010(PP |
26853 | 2 | 2011(Domaine |
26854 | 2 | 2011(PP |
27229 | 2 | Africaine(EAC |
27483 | 2 | Bujumbura(C |

Words containing ):

Rank in Wordlist | Frequency | Word |
---|---|---|
8962 | 16 | %) |
9785 | 14 | %). |
15759 | 6 | %), |
22321 | 3 | 0)3 |
22322 | 3 | 0)431 |
26677 | 2 | $) |
26683 | 2 | 000). |
26694 | 2 | 1)Travaux |
26826 | 2 | 2)Atelier |
27939 | 2 | F.P.P). |

Words containing %:

Rank in Wordlist | Frequency | Word |
---|---|---|
2769 | 80 | 30% |
3418 | 63 | 90% |
3608 | 59 | 80% |
3799 | 55 | 50% |
4601 | 43 | 100% |
4868 | 40 | 60% |
4869 | 40 | 70% |
5138 | 37 | 10% |
5229 | 36 | 40% |
5529 | 33 | 2% |

Words containing &:

Rank in Wordlist | Frequency | Word |
---|---|---|
27526 | 2 | CD&V |
28285 | 2 | Justice&Démocratie |
39404 | 1 | DDH&J |
46880 | 1 | art.7&,2° |

Words containing $:

Rank in Wordlist | Frequency | Word |
---|---|---|
17320 | 5 | $US |
26677 | 2 | $) |
26678 | 2 | $USA |
26749 | 2 | 1200$ |
29591 | 2 | US$ |
29592 | 2 | USA$ |
34958 | 1 | $! |
34959 | 1 | $), |
34960 | 1 | $, |
34961 | 1 | $. |

Words containing ':

Rank in Wordlist | Frequency | Word |
---|---|---|
439 | 502 | d'un |
521 | 434 | c'est |
630 | 368 | d'une |
882 | 265 | d'autres |
1547 | 153 | qu'il |
2001 | 116 | C'est |
2012 | 116 | l'Université |
2031 | 115 | n'est |
2080 | 112 | qu'ils |
2101 | 110 | l'Organisation |

Words containing *:

Rank in Wordlist | Frequency | Word |
---|---|---|
26679 | 2 | %* |

Words containing /:

Rank in Wordlist | Frequency | Word |
---|---|---|
2699 | 83 | et/ou |
4100 | 50 | VIH/SIDA |
7777 | 20 | 2/3 |
9402 | 15 | Burundi/Politique |
9403 | 15 | Burundi/Turquie |
11930 | 10 | Burundi/Economie |
12091 | 10 | VIH/Sida |
13271 | 9 | n°1/04 |
13544 | 8 | 3/4 |
16924 | 6 | n°1/08 |

In the last subsection of this type we look for words containing other special characters: , ( ) % & $ " ' + * = / _
Depending on the language, some of these characters may be allowed within words, while others are not. If words containing forbidden characters do not have very low frequencies, there may be a problem in preprocessing.
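The check described above can be sketched in a few lines of Python. This is only an illustration: the function name `suspicious_words`, the `allowed_chars` parameter, and the `max_freq` threshold of 5 are assumptions made for this example and are not part of the original tooling. The sample pairs are taken from the tables in this section.

```python
# Hypothetical helper (not part of the original tooling): flag words that
# contain a character not allowed for the language AND occur more often
# than an assumed frequency threshold.
SPECIAL_CHARS = ",()%&$\"'+*=/_"

def suspicious_words(word_freqs, allowed_chars="", max_freq=5):
    """Return (word, freq, offending_chars) tuples, most frequent first."""
    forbidden = set(SPECIAL_CHARS) - set(allowed_chars)
    hits = []
    for word, freq in word_freqs:
        offending = sorted(forbidden.intersection(word))
        # Rare words with odd characters are tolerated; frequent ones are not.
        if offending and freq > max_freq:
            hits.append((word, freq, "".join(offending)))
    return sorted(hits, key=lambda hit: -hit[1])

# (word, frequency) pairs copied from the tables in this section.
sample = [("d'un", 502), ("et/ou", 83), ("30%", 80), ("2013(PP", 4), ("CD&V", 2)]

# Assumption for the example: the apostrophe is legitimate inside French words.
flagged = suspicious_words(sample, allowed_chars="'", max_freq=5)
for word, freq, chars in flagged:
    print(word, freq, chars)
```

Which characters count as allowed, and what counts as "very low" frequency, depends on the language and corpus size, so both parameters would need to be chosen per corpus.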
Example query, here for words containing %:
select w_id-100,freq, word from words where w_id>100 and word like "%\%%" limit 10;
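The same query can be repeated for every character in the list. The sketch below does this with Python's standard `sqlite3` module against an in-memory table. It makes several assumptions: the table layout `words(w_id, word, freq)` is inferred from the query, the helper names `like_pattern` and `words_containing` are hypothetical, and the `ORDER BY` clause is added for a stable result. Unlike the query above, SQLite needs an explicit `ESCAPE` clause before a backslash can escape the `LIKE` wildcards `%` and `_`.

```python
import sqlite3

SPECIAL_CHARS = ",()%&$\"'+*=/_"

def like_pattern(ch):
    # % and _ are LIKE wildcards; escape them (and the escape character itself).
    if ch in "%_\\":
        ch = "\\" + ch
    return "%" + ch + "%"

def words_containing(conn, ch, limit=10):
    # Same shape as the query in the text; ORDER BY is an added assumption.
    sql = (
        "SELECT w_id - 100, freq, word FROM words "
        "WHERE w_id > 100 AND word LIKE ? ESCAPE '\\' "
        "ORDER BY w_id LIMIT ?"
    )
    return conn.execute(sql, (like_pattern(ch), limit)).fetchall()

# Toy database; the rows reuse entries from the tables in this section,
# with w_id = rank + 100 to match the offset used in the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (w_id INTEGER PRIMARY KEY, word TEXT, freq INTEGER)")
conn.executemany(
    "INSERT INTO words VALUES (?, ?, ?)",
    [(539, "d'un", 502), (2799, "et/ou", 83), (2869, "30%", 80), (27626, "CD&V", 2)],
)

# One result list per special character.
report = {ch: words_containing(conn, ch) for ch in SPECIAL_CHARS}
for ch, rows in report.items():
    print(ch, rows)
```

Without the escaping step, the pattern for `_` would be `%_%`, which matches every non-empty word, so the report for that character would be meaningless.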
3.12.1 Words with Hyphens
3.12.2 Multiwords
3.12.3 (Multi-)Words with dots